454 research outputs found

    Learning to Behave: Reinforcement Learning in Human Contexts

    Reinforcement learning (RL) has recently attracted significant attention with applications such as improving microchip designs, predicting the behaviour of protein structures and beating humanity’s best in the games of Go, chess and Starcraft-II. These impressive and inspiring successes show how RL can improve our lives; however, they have so far been seen mostly in settings that involve humans to a very limited extent. This thesis looks into the usage of RL in human contexts. First, we provide a systematic literature review of the usage of RL for personalisation, i.e. the adaptation of systems to individuals. Next, we show how RL can be used to personalise a conversational recommender system and find that it outperforms existing approaches, including a gold-standard and task-specific solutions, in a simulation-based study. Since simulators may not be available for all conversational systems that could benefit from personalisation, we next look into the collection of user satisfaction ratings for dialogue data. We consolidate best practices in a UI for user satisfaction annotation and show that high-quality ratings can be obtained. Next, we look into the usage of RL for strategic workforce planning. Here, we find that RL is robust to the uncertainties that are an inherent part of this problem and that RL enables the specification of goals intuitive to domain experts. Having looked into these use cases, we then turn toward the inclusion of safety constraints in RL. We propose how safety constraints from a medical guideline can be taken into account in an observational study on the optimisation of ventilator settings for ICU patients. Next, we look into safety constraints that contain a temporal component; we find that these may make the learning problem infeasible and propose a solution based on reward shaping to address this issue.
    Finally, we propose how RL can benefit from instructions that break a full task into smaller pieces based on the option framework and propose an approach for learning reusable behaviours from instructions to greatly reduce data requirements.

    Collaborative Modeling of Processes: What Facilitation Support Does a Group Need?

    Collaborative modeling of processes is increasingly being used in practice. However, collaborative modeling is difficult. To overcome the difficulties, a professional facilitator can be used. Collaboration Engineering takes up the challenge to design collaboration processes that do not need a professional facilitator, but can be facilitated by practitioners. This research contributes to this by identifying which facilitation aspects are important in collaborative modeling and which of these aspects can be transferred to practitioners. Three facilitation aspects are considered important: (1) guarding the rules of the modeling technique, (2) checking for completeness and (3) translating elements in reality to modeling concepts. The first facilitation aspect can be taken over by a tool that controls the rules of the modeling technique. The second facilitation aspect can most likely be taken over by the practitioner, but for the third aspect a professional with modeling expertise is required.

    Low-Variance Policy Gradient Estimation with World Models

    In this paper, we propose World Model Policy Gradient (WMPG), an approach to reduce the variance of policy gradient estimates using learned world models (WMs). In WMPG, a WM is trained online and used to imagine trajectories. The imagined trajectories are used in two ways: first, to calculate a without-replacement estimator of the policy gradient, and second, to provide an informed baseline given by the return of the imagined trajectories. We compare the proposed approach with AC and MAC on a set of environments of increasing complexity (CartPole, LunarLander and Pong) and find that WMPG has better sample efficiency. Based on these results, we conclude that WMPG can yield increased sample efficiency in cases where a robust latent representation of the environment can be learned.
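
    The variance-reduction role of a baseline mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' WMPG implementation: there is no world model here, and the exact expected reward of a toy one-step problem stands in, as an assumption, for the return of imagined trajectories. All function names, rewards and parameters are hypothetical.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def grad_log_prob(probs, action):
    """Gradient of log pi(action) w.r.t. the logits of a softmax policy."""
    return [(1.0 if k == action else 0.0) - p for k, p in enumerate(probs)]


def pg_samples(logits, rewards, baseline, n, rng):
    """Draw n single-sample score-function (REINFORCE) gradient estimates."""
    probs = softmax(logits)
    samples = []
    for _ in range(n):
        a = rng.choices(range(len(probs)), weights=probs)[0]
        g = grad_log_prob(probs, a)
        # The baseline does not depend on the sampled action,
        # so subtracting it leaves the estimator unbiased.
        samples.append([(rewards[a] - baseline) * gk for gk in g])
    return samples


def mean_and_total_variance(samples):
    """Per-component mean and the summed per-component variance."""
    n, dim = len(samples), len(samples[0])
    means = [sum(s[k] for s in samples) / n for k in range(dim)]
    total_var = sum(
        sum((s[k] - means[k]) ** 2 for s in samples) / n for k in range(dim)
    )
    return means, total_var


def compare_estimators(n=20000, seed=0):
    """Compare estimates with and without a baseline on a toy problem."""
    logits = [0.0, 0.0, 0.0]      # hypothetical uniform policy
    rewards = [10.0, 11.0, 12.0]  # hypothetical one-step rewards
    probs = softmax(logits)
    # Assumption: the exact expected reward plays the role of the
    # "informed baseline"; WMPG would use imagined returns instead.
    baseline = sum(p * r for p, r in zip(probs, rewards))
    mean_nb, var_nb = mean_and_total_variance(
        pg_samples(logits, rewards, 0.0, n, random.Random(seed)))
    mean_b, var_b = mean_and_total_variance(
        pg_samples(logits, rewards, baseline, n, random.Random(seed)))
    return {"mean_no_baseline": mean_nb, "var_no_baseline": var_nb,
            "mean_with_baseline": mean_b, "var_with_baseline": var_b}


if __name__ == "__main__":
    for name, value in compare_estimators().items():
        print(name, value)
```

    Both estimators target the same gradient, but subtracting the baseline removes the large common offset in the rewards, which is where most of the variance in this toy setting comes from.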

    A Repeatable Collaboration Process for Developing a Road Map for Emerging New Technology Business: Case Mobile Marketing

    The unique and little-practiced characteristics of mobile as a marketing medium create a need to set up an action and research agenda regularly to foster the development of the mobile marketing value system. Numerous stakeholders take part in the network to deliver the mobile services. Strengthening their inter-organizational relationships is critical for the emerging value system to evolve. Our paper employs Collaboration Engineering to address this undertaking by designing a standard process that actors in mobile marketing, as well as in other emerging new technology businesses, can use to collaboratively develop a road map for the future. The first field test of this process was conducted in London in connection with the Mobile Marketing Summit ’04 organized by Nokia. The results are promising. Together with senior management of 25 leading brand marketers and advertising agencies, we were able to outline an extensive road map while strengthening the network formation in the field.

    Log Parsing Evaluation in the Era of Modern Software Systems

    Due to the complexity and size of modern software systems, the amount of logs generated is tremendous. Hence, it is infeasible to manually investigate these data in a reasonable time, which makes it necessary to automate log analysis to derive insights about the functioning of the systems. Motivated by an industry use case, we zoom in on one integral part of automated log analysis, log parsing, which is the prerequisite to deriving any insights from logs. Our investigation reveals problematic aspects within the log parsing field, particularly its inefficiency in handling heterogeneous real-world logs. We show this by assessing the 14 most-recognized log parsing approaches in the literature using (i) nine publicly available datasets, (ii) one dataset composed of combined publicly available data, and (iii) one dataset generated within the infrastructure of a large bank. Subsequently, toward improving log parsing robustness in real-world production scenarios, we propose a tool, Logchimera, that enables estimating log parsing performance in industry contexts through generating synthetic log data that resemble industry logs. Our contributions serve as a foundation to consolidate past research efforts, facilitate future research advancements, and establish a strong link between research and industry log parsing.
